composition space


AIMatDesign: Knowledge-Augmented Reinforcement Learning for Inverse Materials Design under Data Scarcity

Yu, Yeyong, Bian, Xilei, Xiong, Jie, Wu, Xing, Qian, Quan

arXiv.org Artificial Intelligence

With the growing demand for novel materials, machine learning-driven inverse design methods face significant challenges in reconciling the high-dimensional materials composition space with limited experimental data. Existing approaches suffer from two major limitations: (I) machine learning models often lack reliability in high-dimensional spaces, leading to prediction biases during the design process; (II) these models fail to effectively incorporate domain expert knowledge, limiting their capacity to support knowledge-guided inverse design. To address these challenges, we introduce AIMatDesign, a reinforcement learning framework that augments experimental data using difference-based algorithms to build a trusted experience pool, accelerating model convergence. To enhance model reliability, an automated refinement strategy guided by large language models (LLMs) dynamically corrects prediction inconsistencies, reinforcing alignment between reward signals and state value functions. Additionally, a knowledge-based reward function leverages expert domain rules to improve stability and efficiency during training. Our experiments demonstrate that AIMatDesign significantly surpasses traditional machine learning and reinforcement learning methods in discovery efficiency, convergence speed, and success rates. Among the numerous candidates proposed by AIMatDesign, experimental synthesis of representative Zr-based alloys yielded a top-performing BMG with 1.7 GPa yield strength and 10.2% elongation, closely matching predictions. Moreover, the framework accurately captured the trend of yield strength variation with composition, demonstrating its reliability and potential for closed-loop materials discovery.


Jacobian-Scaled K-means Clustering for Physics-Informed Segmentation of Reacting Flows

Barwey, Shivam, Raman, Venkat

arXiv.org Artificial Intelligence

This work introduces Jacobian-scaled K-means (JSK-means) clustering, which is a physics-informed clustering strategy centered on the K-means framework. The method allows for the injection of underlying physical knowledge into the clustering procedure through a distance function modification: instead of leveraging conventional Euclidean distance vectors, the JSK-means procedure operates on distance vectors scaled by matrices obtained from dynamical system Jacobians evaluated at the cluster centroids. The goal of this work is to show how the JSK-means algorithm -- without modifying the input dataset -- produces clusters that capture regions of dynamical similarity, in that the clusters are redistributed towards high-sensitivity regions in phase space and are described by similarity in the source terms of samples instead of the samples themselves. The algorithm is demonstrated on a complex reacting flow simulation dataset (a channel detonation configuration), where the dynamics in the thermochemical composition space are known through the highly nonlinear and stiff Arrhenius-based chemical source terms. Interpretations of cluster partitions in both physical space and composition space reveal how JSK-means shifts clusters produced by standard K-means towards regions of high chemical sensitivity (e.g., towards regions of peak heat release rate near the detonation reaction zone). The findings presented here illustrate the benefits of utilizing Jacobian-scaled distances in clustering techniques, and the JSK-means method in particular displays promising potential for improving former partition-based modeling strategies in reacting flow (and other multi-physics) applications.
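The distance modification described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `jsk_means`, the use of the raw Jacobian as the scaling matrix, and the plain-mean centroid update are all assumptions made for illustration. The only idea taken from the abstract is that the distance vector from a sample to a centroid is scaled by a matrix derived from the system Jacobian evaluated at that centroid.

```python
import numpy as np

def jsk_means(X, jacobian, centroids, n_iter=20):
    """Toy Jacobian-scaled K-means (illustrative, not the paper's code).

    X         : (n_samples, n_dims) array of samples
    jacobian  : hypothetical callable, centroid -> (n_dims, n_dims) matrix
    centroids : initial centroids, shape (n_clusters, n_dims)
    """
    centroids = np.array(centroids, dtype=float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Squared norm of the Jacobian-scaled distance vector J(c_k) (x - c_k).
        dists = np.empty((len(X), len(centroids)))
        for k, c in enumerate(centroids):
            scaled = (X - c) @ jacobian(c).T
            dists[:, k] = np.sum(scaled ** 2, axis=1)
        labels = np.argmin(dists, axis=1)
        # Assumption: centroids are updated as the plain mean of their members.
        for k in range(len(centroids)):
            members = X[labels == k]
            if len(members) > 0:
                centroids[k] = members.mean(axis=0)
    return labels, centroids
```

With `jacobian` returning the identity matrix, the sketch reduces to standard K-means with squared Euclidean distance, which makes it easy to compare the two partitions on the same data.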


Materials Representation and Transfer Learning for Multi-Property Prediction

Kong, Shufeng, Guevarra, Dan, Gomes, Carla P., Gregoire, John M.

arXiv.org Artificial Intelligence

The adoption of machine learning in materials science has rapidly transformed materials property prediction. Hurdles limiting full capitalization of recent advancements in machine learning include the limited development of methods to learn the underlying interactions of multiple elements, as well as the relationships among multiple properties, to facilitate property prediction in new composition spaces. To address these issues, we introduce the Hierarchical Correlation Learning for Multi-property Prediction (H-CLMP) framework that seamlessly integrates (i) prediction using only a material's composition, (ii) learning and exploitation of correlations among target properties in multi-target regression, and (iii) leveraging training data from tangential domains via generative transfer learning. The model is demonstrated for prediction of spectral optical absorption of complex metal oxides spanning 69 3-cation metal oxide composition spaces. H-CLMP accurately predicts non-linear composition-property relationships in composition spaces for which no training data is available, which broadens the purview of machine learning to the discovery of materials with exceptional properties. This achievement results from the principled integration of latent embedding learning, property correlation learning, generative transfer learning, and attention models. The best performance is obtained using H-CLMP with Transfer learning (H-CLMP(T)) wherein a generative adversarial network is trained on computational density of states data and deployed in the target domain to augment prediction of optical absorption from composition. H-CLMP(T) aggregates multiple knowledge sources with a framework that is well-suited for multi-target regression across the physical sciences.


Unsupervised Phase Mapping of X-ray Diffraction Data by Nonnegative Matrix Factorization Integrated with Custom Clustering

Stanev, Valentin, Vesselinov, Velimir V., Kusne, A. Gilad, Antoszewski, Graham, Takeuchi, Ichiro, Alexandrov, Boian S.

arXiv.org Machine Learning

Analyzing large X-ray diffraction (XRD) datasets is a key step in high-throughput mapping of the compositional phase diagrams of combinatorial materials libraries. Optimizing and automating this task can help accelerate the process of discovery of materials with novel and desirable properties. Here, we report a new method for pattern analysis and phase extraction of XRD datasets. The method expands the Nonnegative Matrix Factorization method, which has been used previously to analyze such datasets, by combining it with custom clustering and cross-correlation algorithms. This new method is capable of robust determination of the number of basis patterns present in the data which, in turn, enables straightforward identification of any possible peak-shifted patterns. Peak-shifting arises due to continuous change in the lattice constants as a function of composition, and is ubiquitous in XRD datasets from composition spread libraries. Successful identification of the peak-shifted patterns allows proper quantification and classification of the basis XRD patterns, which is necessary in order to decipher the contribution of each unique single-phase structure to the multi-phase regions. The process can be utilized to determine accurately the compositional phase diagram of a system under study. The presented method is applied to one synthetic and one experimental dataset, and demonstrates robust accuracy and identification abilities.
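For readers unfamiliar with the baseline that this method extends, the following is a minimal sketch of plain Nonnegative Matrix Factorization using the standard multiplicative update rules for the Frobenius-norm objective. It does not include the custom clustering or cross-correlation steps described in the abstract. The function name `nmf`, the data layout (rows as diffraction patterns, columns as angle bins), and all parameter defaults are assumptions for illustration.

```python
import numpy as np

def nmf(X, rank, n_iter=500, seed=0, eps=1e-12):
    """Plain NMF, X ~ W @ H, via multiplicative updates (illustrative sketch).

    X    : (n_patterns, n_bins) nonnegative array, one XRD pattern per row
    rank : assumed number of basis patterns
    Returns W (n_patterns, rank) mixing weights and H (rank, n_bins) basis patterns.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a nonnegative ratio, so W and H stay nonnegative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

In this sketch `rank` must be chosen by hand; the abstract's contribution is precisely a robust way to determine that number and to identify peak-shifted copies of the same basis pattern.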